# Knowledge Distillation Optimization
## F Lite 7B
*Freepik · OpenRAIL · Image Generation · English · 342 downloads · 22 likes*

A 7-billion-parameter diffusion model jointly developed by Freepik and Fal, built through knowledge distillation for fast generation and efficient memory usage.
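As a rough sketch of how such a checkpoint might be driven, assuming it ships a custom Diffusers pipeline loadable with `trust_remote_code` (the repo id `Freepik/F-Lite` and the generation settings below are illustrative guesses, not confirmed by this listing):

```python
# Minimal text-to-image sketch. Assumes the checkpoint exposes a custom
# Diffusers pipeline via trust_remote_code; repo id and arguments are
# illustrative assumptions.
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "Freepik/F-Lite",          # assumed repo id
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,    # needed for non-built-in pipeline classes
).to("cuda")

image = pipe(
    prompt="a watercolor fox in a misty forest",
    num_inference_steps=28,    # distilled models often need fewer steps
    guidance_scale=4.5,
).images[0]
image.save("fox.png")
```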
## Ultravox v0.3
*FriendliAI · MIT · Audio-to-Text · Transformers, English · 20 downloads · 1 like*

Ultravox is a multimodal speech large language model based on Llama3.1-8B-Instruct and Whisper-small, capable of processing both speech and text inputs.
## Ultravox v0.5 Llama 3.3 70B
*fixie-ai · MIT · Audio-to-Text · Transformers, multilingual · 3,817 downloads · 26 likes*

Ultravox is a multimodal voice large language model built upon Llama3.3-70B and Whisper, supporting both voice and text inputs, suitable for scenarios such as voice agents and translation.
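Ultravox checkpoints follow the `transformers` custom-pipeline pattern: the audio waveform and chat turns go in together and text comes out. A minimal sketch, assuming the repo id `fixie-ai/ultravox-v0_5-llama-3_3-70b` (inferred from this listing) and 16 kHz mono input:

```python
# Speech-in, text-out sketch using the transformers custom-pipeline
# pattern documented for Ultravox models. Repo id is an assumption.
import librosa
import transformers

pipe = transformers.pipeline(
    model="fixie-ai/ultravox-v0_5-llama-3_3-70b",
    trust_remote_code=True,
)

audio, sr = librosa.load("question.wav", sr=16000)  # 16 kHz mono audio
turns = [
    {"role": "system", "content": "You are a helpful voice assistant."},
]
out = pipe(
    {"audio": audio, "turns": turns, "sampling_rate": sr},
    max_new_tokens=64,
)
print(out)
```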
## BGE-M3 Distill 8L
*altaidevorg · Text Embedding · 249 downloads · 7 likes*

An 8-layer embedding model distilled from BAAI/bge-m3, achieving a 2.5x speedup while maintaining retrieval performance.
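A minimal retrieval sketch, assuming the distilled checkpoint loads through `sentence-transformers` like its BAAI/bge-m3 parent (the repo id `altaidevorg/bge-m3-distill-8l` is inferred from the listing and may differ):

```python
# Embedding + cosine-similarity sketch for the distilled bge-m3 student.
# Repo id is an assumption inferred from the listing.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("altaidevorg/bge-m3-distill-8l")

queries = ["What is knowledge distillation?"]
docs = [
    "Knowledge distillation trains a small student to mimic a large teacher.",
    "Bananas are rich in potassium.",
]
q_emb = model.encode(queries, normalize_embeddings=True)
d_emb = model.encode(docs, normalize_embeddings=True)
print(util.cos_sim(q_emb, d_emb))  # higher score = more relevant
```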
## Ultravox v0.4.1 Mistral Nemo
*fixie-ai · MIT · Audio-to-Text · Transformers, multilingual · 1,285 downloads · 25 likes*

Ultravox is a multimodal model based on Mistral-Nemo and Whisper, capable of processing both speech and text inputs, suitable for tasks such as voice agents and speech translation.
## Ultravox v0.4.1 Llama 3.1 70B
*fixie-ai · MIT · Audio-to-Text · Transformers, multilingual · 204 downloads · 24 likes*

Ultravox is a multimodal speech large language model built upon the pre-trained Llama3.1-70B-Instruct and whisper-large-v3-turbo backbones, capable of receiving both speech and text as inputs.
## Ultravox v0.4.1 Llama 3.1 8B
*fixie-ai · MIT · Audio-to-Text · Transformers, multilingual · 747 downloads · 97 likes*

Ultravox is a multimodal speech large language model built on Llama3.1-8B-Instruct and whisper-large-v3-turbo, capable of processing both speech and text inputs.
## Polish Reranker RoBERTa V2
*sdadas · Text Embedding · Transformers, Other · 961 downloads · 2 likes*

An improved Polish reranking model based on sdadas/polish-roberta-large-v2, trained with the RankNet loss and supporting Flash Attention 2 acceleration.
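The RankNet objective named above models the probability that a relevant passage outscores an irrelevant one with a sigmoid over the score difference. A generic PyTorch sketch of that loss (illustrative, not sdadas' actual training code):

```python
# Generic pairwise RankNet loss: BCE over the score difference, with the
# target fixed at 1 (the positive passage should rank above the negative).
import torch
import torch.nn.functional as F

def ranknet_loss(pos_scores: torch.Tensor, neg_scores: torch.Tensor) -> torch.Tensor:
    # pos_scores / neg_scores: [batch] reranker scores for (query, passage) pairs
    diff = pos_scores - neg_scores
    return F.binary_cross_entropy_with_logits(diff, torch.ones_like(diff))

pos = torch.tensor([2.1, 0.3])
neg = torch.tensor([0.5, 0.9])
print(ranknet_loss(pos, neg))  # small when positives already outscore negatives
```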
## Llama3.1 1B Neo BAAI 1000k
*yang31210999 · Apache-2.0 · Large Language Model · Transformers · 39 downloads · 2 likes*

Llama3.1-Neo-1B-100w is an efficient language model pruned to 1.4B parameters from Meta-Llama-3.1-8B-Instruct and fine-tuned with the LLM-Neo method, which combines LoRA with knowledge distillation. The training data consists of 1 million samples from BAAI/Infinity-Instruct.
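In the spirit of LLM-Neo, a student can be wrapped with LoRA adapters and trained to match the teacher's token distribution, so only the adapter weights are updated. A minimal sketch under loud assumptions: the student path is hypothetical, and the temperature and LoRA settings are illustrative, not the paper's recipe.

```python
# LoRA + logit-distillation sketch (LLM-Neo combines the two).
# Model ids, temperature, and LoRA config are illustrative assumptions.
import torch
import torch.nn.functional as F
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer

teacher_id = "meta-llama/Llama-3.1-8B-Instruct"  # teacher per the entry
student_id = "path/to/pruned-1.4b-student"       # hypothetical pruned student

tok = AutoTokenizer.from_pretrained(teacher_id)
teacher = AutoModelForCausalLM.from_pretrained(teacher_id).eval()
student = AutoModelForCausalLM.from_pretrained(student_id)
student = get_peft_model(
    student, LoraConfig(task_type="CAUSAL_LM", r=16, lora_alpha=32)
)

batch = tok(["Distillation transfers teacher knowledge."], return_tensors="pt")
with torch.no_grad():
    t_logits = teacher(**batch).logits  # teacher provides soft targets
s_logits = student(**batch).logits

T = 2.0  # softening temperature
kd = F.kl_div(
    F.log_softmax(s_logits / T, dim=-1),
    F.softmax(t_logits / T, dim=-1),
    reduction="batchmean",
) * T * T
kd.backward()  # gradients reach only the LoRA adapter weights
```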
## Bangla Sentence Transformer
*shihab17 · Text Embedding, multilingual · 1,257 downloads · 4 likes*

A Bangla sentence embedding model fine-tuned from stsb-xlm-r-multilingual, supporting sentence-similarity computation and semantic search.
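A short semantic-search sketch with the standard `sentence-transformers` utilities, assuming the repo id `shihab17/bangla-sentence-transformer` (inferred from the listing):

```python
# Semantic search over a tiny Bangla corpus. Repo id is an assumption.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("shihab17/bangla-sentence-transformer")

corpus = ["আমি বই পড়তে ভালোবাসি।", "আজ আবহাওয়া খুব ভালো।"]
corpus_emb = model.encode(corpus, convert_to_tensor=True)

query_emb = model.encode("পড়াশোনা আমার প্রিয় কাজ।", convert_to_tensor=True)
hits = util.semantic_search(query_emb, corpus_emb, top_k=1)
print(hits)  # e.g. [[{'corpus_id': 0, 'score': ...}]]
```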
## DistilBERT Dot TAS-B B256 MSMARCO
*sebastian-hofstaetter · Text Embedding · Transformers, English · 3,188 downloads · 23 likes*

A DistilBERT-based dual-encoder with dot-product scoring, trained on the MSMARCO-Passage dataset with balanced topic-aware sampling, suitable for dense retrieval and for re-ranking candidate sets.
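A dual-encoder of this kind embeds queries and passages independently and ranks by the dot product. A sketch assuming CLS-token pooling, the usual convention for this DistilBERT dot-product family, with the repo id inferred from the listing:

```python
# Dual-encoder dot-product scoring sketch. CLS pooling and the repo id
# are assumptions based on this model family's conventions.
import torch
from transformers import AutoModel, AutoTokenizer

name = "sebastian-hofstaetter/distilbert-dot-tas_b-b256-msmarco"
tok = AutoTokenizer.from_pretrained(name)
enc = AutoModel.from_pretrained(name).eval()

def embed(texts):
    batch = tok(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        return enc(**batch).last_hidden_state[:, 0, :]  # CLS vector

q = embed(["what is knowledge distillation"])
p = embed(["Distillation compresses a teacher model into a student.",
           "The capital of France is Paris."])
print(q @ p.T)  # dot-product relevance scores, higher = more relevant
```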
## mMiniLMv2 L6 H384 Distilled From XLM-R Large
*nreimers · Large Language Model · Transformers · 197 downloads · 17 likes*

MiniLMv2 is a lightweight language representation model from Microsoft that achieves efficient performance through knowledge distillation; this multilingual variant is distilled from XLM-R Large.
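MiniLMv2's own objective distills self-attention relations rather than output logits, but the classic temperature-scaled logit distillation is the simplest illustration of the technique this section is themed around. A generic sketch (Hinton-style logit matching, explicitly not MiniLMv2's exact objective):

```python
# Classic temperature-scaled logit distillation, combining a softened
# KL term against the teacher with cross-entropy on gold labels.
# Shown as a generic illustration; MiniLMv2 distills attention relations.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft targets: match the teacher's softened distribution.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * T * T
    # Hard targets: ordinary cross-entropy on gold labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

s = torch.randn(4, 10)                # student logits
t = torch.randn(4, 10)                # teacher logits
y = torch.randint(0, 10, (4,))        # gold labels
print(distillation_loss(s, t, y))
```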
## DistilBERT Dot Margin-MSE T2 MSMARCO
*sebastian-hofstaetter · Text Embedding · Transformers, English · 99 downloads · 2 likes*

A DistilBERT-based dense retrieval model trained with knowledge distillation, suitable for passage re-ranking and direct retrieval tasks.
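The distillation objective named in this model's title, Margin-MSE (Hofstaetter et al.), teaches the dense student to reproduce the teacher's score *margin* between a positive and a negative passage rather than its absolute scores. A generic sketch of the loss:

```python
# Margin-MSE distillation sketch: MSE between the student's and the
# teacher's (positive - negative) score margins per query.
import torch
import torch.nn.functional as F

def margin_mse(student_pos, student_neg, teacher_pos, teacher_neg):
    # All tensors: [batch] relevance scores.
    return F.mse_loss(student_pos - student_neg, teacher_pos - teacher_neg)

sp = torch.tensor([1.2, 0.8])  # student scores, positive passages
sn = torch.tensor([0.4, 0.7])  # student scores, negative passages
tp = torch.tensor([3.0, 2.5])  # teacher scores, positive passages
tn = torch.tensor([1.0, 2.4])  # teacher scores, negative passages
print(margin_mse(sp, sn, tp, tn))
```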